14 research outputs found

    DIPBench: An Independent Benchmark for Data-Intensive Integration Processes

    The integration of heterogeneous data sources is one of the main challenges within the area of data engineering. Due to the absence of an independent and universal benchmark for data-intensive integration processes, we propose a scalable benchmark, called DIPBench (Data intensive integration Process Benchmark), for evaluating the performance of integration systems. This benchmark could be used for subscription systems such as replication servers, for distributed and federated DBMS, or for message-oriented middleware platforms such as Enterprise Application Integration (EAI) servers and Extraction Transformation Loading (ETL) tools. To achieve this universal view of integration processes, the benchmark is designed in a conceptual, process-driven way. The benchmark comprises 15 integration process types. We specify the source and target data schemas and provide a tool suite for the initialization of the external systems, the execution of the benchmark and the monitoring of the integration system's performance. The core benchmark execution may be influenced by three scale factors. Finally, we discuss a metric unit used for evaluating the measured integration system's performance, and we illustrate our reference benchmark implementation for federated DBMS.

    GCIP: Exploiting the Generation and Optimization of Integration Processes

    As a result of the changing scope of data management towards the management of highly distributed systems and applications, integration processes have gained in importance. Such integration processes represent an abstraction of workflow-based integration tasks. In practice, integration processes are pervasive and the performance of complete IT infrastructures strongly depends on the performance of the central integration platform that executes the specified integration processes. In this area, the three major problems are: (1) significant development efforts, (2) low portability, and (3) inefficient execution. To overcome those problems, we follow a model-driven generation approach for integration processes. In this demo proposal, we introduce the so-called GCIP Framework (Generation of Complex Integration Processes), which allows the modeling of integration processes and the generation of different concrete integration tasks. The model-driven approach opens opportunities for rule-based and workload-based optimization techniques.

    Invisible Deployment of Integration Processes

    Due to the changing scope of data management towards the management of heterogeneous and distributed systems and applications, integration processes gain in importance. This is particularly true for those processes used as abstractions of workflow-based integration tasks; these are widely applied in practice. In such scenarios, a typical IT infrastructure comprises multiple integration systems with overlapping functionalities. The major problems in this area are high development effort, low portability and inefficiency. Therefore, in this paper, we introduce the vision of invisible deployment that addresses the virtualization of multiple, heterogeneous, physical integration systems into a single logical integration system. This vision raises several challenging issues concerning both deployment and runtime aspects. Here, we describe those challenges, discuss possible solutions and present a detailed system architecture for that approach. As a result, the development effort can be reduced and the portability as well as the performance can be improved significantly.

    Model-Driven Development of Complex and Data-Intensive Integration Processes

    Due to the changing scope of data management from centrally stored data towards the management of distributed and heterogeneous systems, integration takes place on different levels. The lack of standards for information integration as well as application integration has resulted in a large number of different integration models and proprietary solutions. With the aim of a high degree of portability and the reduction of development efforts, model-driven development, following the Model-Driven Architecture (MDA), is advantageous in this context as well. Hence, in the GCIP project (Generation of Complex Integration Processes), we focus on the model-driven generation and optimization of integration tasks using a process-based approach. In this paper, we contribute detailed generation aspects and finally discuss open issues and further challenges.

    Vectorizing Instance-Based Integration Processes

    The inefficiency of integration processes as an abstraction of workflow-based integration tasks is often attributed to low resource utilization and significant waiting times for external systems. Due to the increasing use of integration processes within IT infrastructures, throughput optimization has a high influence on the overall performance of such an infrastructure. In the area of computational engineering, low resource utilization is addressed with vectorization techniques. In this paper, we introduce the concept of vectorization in the context of integration processes in order to achieve a higher degree of parallelism. Here, transactional behavior and serialized execution must be ensured. Our evaluation shows that the message throughput can be significantly increased.

    Message Indexing for Document-Oriented Integration Processes

    The integration of heterogeneous systems is still an evolving research area. Due to the complexity of integration processes, their optimization remains challenging. Message-based integration systems, like EAI servers and workflow process engines, are mostly document-oriented, using XML technologies in order to achieve suitable data independence from the different proprietary data representations of the supported external systems. However, such an approach causes high costs for single-value evaluations within the integration processes. At this point, message indexing, adopting extended database technologies, could be applied in order to reach better performance. In this paper, we introduce our message indexing structure MIX and discuss and evaluate immediate as well as deferred indexing concepts. Further, we describe preliminary adaptive index tuning techniques.

    Workload-based optimization of integration processes

    The efficient execution of integration processes between distributed, heterogeneous data sources and applications is a challenging research area of data management. These integration processes are an abstraction for workflow-based integration tasks, used in EAI servers and WfMS. The major problem is significant workload changes during runtime. The performance of integration processes strongly depends on those dynamic workload characteristics, and hence workload-based optimization is important. However, existing approaches to workflow optimization only address rule-based optimization and disregard changing workload characteristics. To overcome the problem of inefficient process execution in the presence of workload shifts, we present an approach for the workload-based optimization of instance-based integration processes and show that significant execution time reductions are possible.
